Outline

Project Aims

Aim 1: Predict the degree of improvement in ruminative and depressive symptoms (RRS & HDRS-6).

Aim 2: Determine which treatment-specific models generalize (predict) across treatment arms.

Outcomes

  • Ruminative Responses Scale – Reflection dimension (RRSR)

  • Hamilton Depression Rating Scale (HDRS) 6-item Version
    • Depressed Mood
    • Guilt
    • Work & Interests
    • Psychomotor Retardation
    • Psychic Anxiety
    • General Somatic Symptoms

Sample outline

Demographics Table
(Groups: e = ECT, k = ketamine, s = sleep deprivation)

Group   n    Mean Age (SD)   % Male   HDRS-6 Baseline   HDRS-6 % Change   RRSR Baseline   RRSR % Change
e       25   39.4 (13.7)     44.0     11.32             -28.6             10.92            -5.1
k       48   38.5 (10.6)     52.1     11.54             -53.6             10.90            -14.9
s       37   33.4 (10.8)     43.2      9.78             -30.9             12.24            -13.5
Treatment-by-HDRS-6 % Change ANOVA: Tukey HSD pairwise comparisons

Pair    Difference   95% CI Lower   95% CI Upper   Adjusted p
k - e     -0.210       -0.413         -0.008         0.039
s - e     -0.019       -0.231          0.193         0.976
s - k      0.192        0.012          0.371         0.033
Treatment-by-RRSR % Change ANOVA: Tukey HSD pairwise comparisons

Pair    Difference   95% CI Lower   95% CI Upper   Adjusted p
k - e     -0.140       -0.334          0.054         0.206
s - e     -0.152       -0.355          0.052         0.186
s - k     -0.012       -0.184          0.161         0.986
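The pairwise tables above have the shape of Tukey HSD output (difference, CI bounds, adjusted p). For reference, an equivalent comparison can be sketched in Python with statsmodels; the data below are placeholders, not the study values.

```python
import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd

rng = np.random.default_rng(0)

# Synthetic percent-change scores for three arms (illustrative only;
# group sizes mirror the demographics table, values do not).
groups = np.repeat(["e", "k", "s"], [25, 48, 37])
change = np.concatenate([
    rng.normal(-0.29, 0.3, 25),   # e
    rng.normal(-0.54, 0.3, 48),   # k
    rng.normal(-0.31, 0.3, 37),   # s
])

# Tukey HSD: all pairwise group differences with family-wise adjusted p-values
res = pairwise_tukeyhsd(change, groups, alpha=0.05)
print(res)
```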

Predictors

  • Demographics
    • Age
    • Sex
    • Duration of lifetime illness in years
    • Number of previous episodes
  • Baseline Symptom Severity
    • Models fit with and without baseline severity to determine reliance on baseline values
  • Multimodal Imaging
    • Regional Cortical Thickness
    • Subcortical Volume
    • Diffusion (FA, RD, AD, MD, Kurtosis)
    • Between-network ICA-based connectivity
    • Global connectivity of ICA nodes
Global Connectivity

Modeling approach

  1. Coarse grid search over hyperparameters
    • Pick classifier: Random Forest vs. SVM vs. Gradient Boosted Trees (simple coarse grid search for each with 10-fold cross validation)
    • Degree of feature filtering: We have ~16K features. I removed one of each pair of highly correlated features outright, testing |r| cutoffs of {0.1, 0.2, 0.3, 0.4, 0.5}
  2. Given results of coarse search, tune model-specific parameters and process hyperparameters more finely
    • Manual approach with more hands-on feature selection
      • Removal of global connectivity features with zero mode and little variation
      • Removal of features with near-zero variance (similar to above)
      • Iterative tuning of model-specific parameters: Bigger knobs first.
    • Adaptive grid search with a GLS (generalized least squares) algorithm to discard poor hyperparameter combinations early
      • Removal of global connectivity features with zero mode and little variation
      • Removal of features with near-zero variance (similar to above)
      • Joint removal of highly correlated features
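The filtering steps above (near-zero-variance removal plus correlated-feature pruning) can be sketched as follows. This is a Python approximation of that pipeline; the function name and tolerances are illustrative, not the actual analysis code.

```python
import numpy as np

def filter_features(X, var_tol=1e-8, r_cutoff=0.3):
    """Drop near-zero-variance columns, then greedily drop one column
    from every pair whose absolute correlation exceeds r_cutoff."""
    X = np.asarray(X, dtype=float)
    keep = np.flatnonzero(X.var(axis=0) > var_tol)   # near-zero variance filter
    X = X[:, keep]
    corr = np.abs(np.corrcoef(X, rowvar=False))
    np.fill_diagonal(corr, 0.0)
    selected = []
    for j in range(X.shape[1]):                      # greedy correlation pruning
        if all(corr[j, s] <= r_cutoff for s in selected):
            selected.append(j)
    return keep[selected]                            # indices into original X

# Toy demo: 5 random features, one near-duplicate, one constant column
rng = np.random.default_rng(1)
base = rng.normal(size=(100, 5))
X = np.column_stack([
    base,
    base[:, 0] + rng.normal(scale=0.01, size=100),   # near-duplicate of col 0
    np.full(100, 3.0),                               # constant column
])
cols = filter_features(X, r_cutoff=0.5)
```

The duplicate and constant columns are removed; the five independent features survive.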

Machine learning basics

  • Supervised machine learning is the process of training a model on labeled data and using it to make predictions on data not used for training
  • Hundreds of models are available, varying in complexity and purpose (classification, regression, clustering)
  • Models have parameters learned from the data (like beta weights in linear regression) as well as hyperparameters that require tuning to find optimal settings/combinations
  • Finding optimal hyperparameter combinations is usually done with some form of grid search
  • Bias-variance tradeoff:
    • Overly simple models usually don’t capture complex patterns in the data; they are said to be ‘underfit’ and have ‘high bias’
    • Overly complex models capture noise/idiosyncrasies of the training data; they are said to be ‘overfit’ or have ‘high variance’
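The under/overfitting contrast can be illustrated by fitting polynomials of different degrees to noisy data and comparing training vs. held-out error (synthetic data, not from the study):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 60)
y = np.sin(3 * x) + rng.normal(scale=0.2, size=60)   # noisy nonlinear signal
x_tr, y_tr, x_te, y_te = x[:40], y[:40], x[40:], y[40:]

def poly_mse(degree):
    """Fit a polynomial on the training half; return (train MSE, test MSE)."""
    coefs = np.polyfit(x_tr, y_tr, degree)
    mse = lambda xs, ys: np.mean((np.polyval(coefs, xs) - ys) ** 2)
    return mse(x_tr, y_tr), mse(x_te, y_te)

for d in (1, 4, 15):
    train_mse, test_mse = poly_mse(d)
    print(d, round(train_mse, 3), round(test_mse, 3))
```

Degree 1 underfits (high error everywhere); a very high degree drives training error down while held-out error grows, the signature of overfitting.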
Under and overfitting

  • Finding the right amount of complexity is usually done with cross validation
    • Train the model on a majority of the dataset
    • Test fitted model on held-out data
    • Usually this is repeated many times and results are averaged
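The repeated cross-validation scheme above can be sketched with scikit-learn; Ridge regression and the synthetic data here are stand-ins, not the study's model or features.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import RepeatedKFold, cross_val_score

# Synthetic regression problem (placeholder for the real feature matrix)
X, y = make_regression(n_samples=110, n_features=20, noise=10.0, random_state=0)

# 10 repeats of 10-fold CV: every model is trained on 90% of subjects
# and scored on the held-out 10%, 100 times in total
cv = RepeatedKFold(n_splits=10, n_repeats=10, random_state=0)
scores = cross_val_score(Ridge(), X, y, cv=cv,
                         scoring="neg_root_mean_squared_error")
print(scores.mean(), scores.std())   # average held-out performance
```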
Cross validation

Coarse grid search: Classifier comparison

Result: Gradient boosted trees consistently outperformed RFs and SVMs. Lower |r| thresholds tend to perform better.

Gradient Boosted Trees Illustration

Unlike RFs and SVMs, gradient boosted trees have many parameters to tune, so grid search spaces are extensive. Manual tuning is difficult, so I’ll compare its performance to adaptive grid searches.
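A coarse search over the bigger gradient-boosting knobs might look like the sketch below; scikit-learn's GradientBoostingRegressor and this particular grid are illustrative assumptions, not the actual tuning setup.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV, KFold

X, y = make_regression(n_samples=120, n_features=30, noise=15.0, random_state=0)

# "Bigger knobs first": learning rate, number of trees, and tree depth
# dominate; finer knobs (subsample, min_samples_leaf) can be tuned
# afterwards around the winning combination.
grid = {
    "learning_rate": [0.01, 0.1],
    "n_estimators": [100, 300],
    "max_depth": [2, 4],
}
search = GridSearchCV(GradientBoostingRegressor(random_state=0), grid,
                      cv=KFold(n_splits=5, shuffle=True, random_state=0),
                      scoring="neg_root_mean_squared_error")
search.fit(X, y)
print(search.best_params_)
```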

Gradient Boosted Tree Outline

Within-arm predictions: Manual grid search results

Number of Parameterizations per Model

Outcome   Baseline excluded (FALSE)   Baseline included (TRUE)
RRSR      2304                        2304
HDRS-6    2304                        2304

Within-arm predictions: Adaptive grid search results

Comparison of manual and adaptive model performances

Results: The biggest gains for the adaptive approach are in the ECT arm; the adaptive method does poorly in the sleep deprivation arm.

Outline of top-performing models

ECT: RRS, Reflection; baseline symptoms included


ECT: HDRS-6; baseline symptoms included


Ketamine: RRS, Reflection; baseline symptoms included


Ketamine: HDRS-6; baseline symptoms included


Sleep Deprivation: RRS, Reflection; baseline symptoms included


Sleep Deprivation: HDRS-6; baseline symptoms included


Error Analysis of Best Models

Approach: Median-split subjects by their RMSE across 10 repeats of 10-fold cross-validation. Compare demographic and imaging characteristics of subjects with high vs. low errors.
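The per-subject error summary behind this median split can be sketched as follows (synthetic data, Ridge as a stand-in model):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import RepeatedKFold

X, y = make_regression(n_samples=80, n_features=15, noise=20.0, random_state=0)

# Collect each subject's squared error every time they land in a test fold,
# then summarize per subject and median-split into high/low error groups.
sq_err = [[] for _ in range(len(y))]
for tr, te in RepeatedKFold(n_splits=10, n_repeats=10, random_state=0).split(X):
    pred = Ridge().fit(X[tr], y[tr]).predict(X[te])
    for i, p in zip(te, pred):
        sq_err[i].append((p - y[i]) ** 2)

rmse = np.array([np.sqrt(np.mean(e)) for e in sq_err])
high_error = rmse > np.median(rmse)   # boolean split for group comparisons
```

The `high_error` indicator can then be crossed with demographic and imaging variables to characterize who the model struggles with.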

ECT RRSR

Results: The model struggled with younger participants and with participants who had higher baseline symptoms and subsequently improved more. Imaging measures were similarly distributed across high- and low-error participants; structural imaging measures were more similar than functional measures, which were skewed by a few outliers.

ECT HDRS

Results: The model did worse with participants who had fewer baseline symptoms (unlike the RRSR model) and who were older with a longer symptom duration. Global connectivity measures were distributed more orthogonally between error categories.

Ketamine RRSR

Results: The model did worse with participants who improved more, though baseline symptoms didn’t make a difference. Age, symptom duration, and number of prior episodes were somewhat different between classes. Barring a few outliers in the functional data, imaging measures were similar.

Ketamine HDRS

Results: Similarly, the model did worse with participants who improved more. Age remained different. Global connectivity measures were more orthogonal in PCA space.

TSD RRSR

Results: Demographic & clinical measures differed strongly by error class. The functional imaging space was skewed by several outliers.

TSD HDRS

Results: Same as RRSR.

Performance by Confidence Level

Maybe if we only offer predictions for patients we are confident about, the models will do better overall? My naive approach: the model is more confident about a patient when their repeated predictions across cross-validation folds are more similar, i.e., have a lower standard deviation.
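That confidence heuristic can be sketched as follows (synthetic data; Ridge as a stand-in model, and predicted-vs-actual correlation as an assumed performance metric):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import RepeatedKFold

X, y = make_regression(n_samples=80, n_features=15, noise=20.0, random_state=1)

# Gather every out-of-fold prediction each subject receives across repeats
preds = [[] for _ in range(len(y))]
for tr, te in RepeatedKFold(n_splits=10, n_repeats=10, random_state=1).split(X):
    fit = Ridge().fit(X[tr], y[tr])
    for i, p in zip(te, fit.predict(X[te])):
        preds[i].append(p)

mean_pred = np.array([np.mean(p) for p in preds])
sd_pred = np.array([np.std(p) for p in preds])    # lower SD = "more confident"

# Compare performance on the more- vs less-confident half
confident = sd_pred <= np.median(sd_pred)
r_conf = np.corrcoef(mean_pred[confident], y[confident])[0, 1]
r_rest = np.corrcoef(mean_pred[~confident], y[~confident])[0, 1]
print(round(r_conf, 2), round(r_rest, 2))
```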

Result: Seems like PvAc is higher when we only predict high-variance patients.

Next Steps